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ABSTRACT 

We  have  recently  proposed  a  new  method  for 
combined  design  optimization  and  calibration-based 
validation  using  a  sequential  approach  with  variable- 
size  local  domains  of  the  design  space  and  statistical 
bootstrap  techniques.  Our  work  was  motivated  by  the 
fact  that  model  validation  in  the  entire  design  space 
may  be  neither  affordable  nor  necessary.  The  method 
proceeds  iteratively  by  obtaining  test  data  at  a  design 
point,  constructing  around  it  a  local  domain  in  which 
the  model  is  considered  valid,  and  optimizing  the 
design  within  this  local  domain.  Due  to  test  variability, 
it  is  important  to  know  how  many  tests  are  needed  to 
size  each  local  domain  of  the  sequential  optimization 
process.  Conducting  an  unnecessarily  large  number  of 
tests  may  be  inefficient,  while  a  small  number  of  tests 
may  be  insufficient  to  achieve  the  desired  validity  level. 
In  this  paper,  we  introduce  a  technique  to  determine  the 
number  of  tests  required  to  account  for  their  variability 
by  sizing  the  local  domains  accordingly.  The  goal  is  to 
achieve  a  desired  level  of  model  validation  in  each 
domain  using  the  correlation  between  model  data  at  the 
center  and  any  other  point  in  the  local  domain.  The 
proposed  technique  is  illustrated  by  means  of  a  piston 
design  example. 

1.  INTRODUCTION 

A  major  challenge  in  engineering  design 
optimization  is  that  the  computational  and/or  simulation 
models  utilized  for  analysis  to  quantify  objective  and 
constraint  values  (and  their  sensitivities  with  respect  to 
design  variables  and  parameters)  must  remain 
adequately  accurate  in  very  large  design  spaces,  both  in 
terms  of  dimensionality  and  range.  In  the  engineering 
design  community,  adequate  accuracy  of  the  models  has 
been  assessed  by  validation  approaches  that  compare 


model  predictions  to  experimental  data.  Model 
validation  for  specific  analysis  purposes  typically 
presents  significantly  less  computational  challenges  as 
such  models  are  developed  with  the  intention  to  be 
utilized  in  the  context  of  a  relatively  narrow  scope.  We 
believe  that  the  approach  to  model  validation  for  design 
optimization  however,  as  used  predominantly  by  the 
engineering  design  community,  has  several  limitations: 

1.  Model  predictions  are  compared  to  experimental 
data  at  a  small  number  of  points.  Typically,  this 
may  be  inadequate  for  large  number  of  model 
inputs  (dimensionality)  that  constitute  design 
optimization  variables  and  parameters,  and  can 
take  values  from  large  intervals  (range). 

2.  Model  predictions  are  typically  compared  at  points 
for  which  experimental  data  are  already  available; 
new  tests  are  usually  not  conducted.  However, 
these  points  may  not  be  relevant  to  the  optimal 
design,  which  is  determined  by  intersecting  active 
constraint  boundaries  that  may  lie  far  away  from 
the  test  points. 

3.  If  new  tests  are  conducted  for  model  validation, 
they  are  all  performed  before  the  optimization 
process.  Typically,  the  test  points  are  distributed 
across  the  design  space  in  an  attempt  to  fill  it 
according  to  the  design-of-experiments  paradigm. 
However,  numerical  optimization  algorithms 
generate  a  sequence  of  design  candidates  (or 
iterates)  until  they  converge  to  an  optimal  design 
according  to  termination  criteria,  desired  accuracy 
and/or  computational  budget  (e.g.,  maximum 
number  of  iterations).  They  rarely  visit  the  entire 
design  space.  Thus,  testing  at  points  distributed 
over  the  entire  design  space  may  be  irrelevant,  and 
thus  wasteful.  We  think  that  this  is  true  even  when 
using  optimization  algorithms  that  evaluate  points 
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in  the  entire  design  space  in  an  attempt  to  find 

good  local  minima,  because  the  test  cost  can 

outweigh  these  attempts. 

To  address  the  above  issues,  we  have  been 
presenting  in  a  series  of  recent  publications  (Li  et  al., 
2010;  Drignei  et  al,  2012a  and  2012b),  the  building 
blocks  of  a  new  methodology  for  sequential  design 
optimization  that  utilizes  calibration-based  model 
validation  in  local  (sub)domains  of  the  design  space. 
The  above  publications  also  provide  a  comprehensive 
literature  review  on  model  validation.  Our  new 
methodology  proceeds  iteratively  by  obtaining  test  data 
at  a  design  point,  constructing  around  it  a  local  domain 
in  which  the  model  is  considered  valid,  and  optimizing 
the  design  within  this  local  domain.  If  the  optimal 
design  associated  with  the  local  domain  lies  on  the 
boundary  of  the  latter,  a  new  iteration  is  conducted  with 
the  optimal  design  serving  as  the  new  center  point.  This 
process  either  converges  if  the  optimal  design  of  a  local 
domain  lies  in  the  interior  of  the  latter  or  is  terminated 
if  testing  resources  are  exhausted.  In  the  second  case, 
we  are  still  left  with  a  design  that  is  optimal  with 
respect  to  a  part  of  the  design  space. 

The  premise  of  this  paradigm  is  that  testing 
resources  are  utilized  more  efficiently.  We  recognize 
that  the  new  paradigm  may  be  conflicting  with  the 
automation  of  the  design  optimization  process,  as  it 
requires  the  temporary  interruption  of  the  design 
optimization  process  when  new  tests  become  necessary. 
However,  we  emphasize  that  current  practice  can  lead 
to  invalid  and  bad  designs  because  of  the  three 
arguments  we  presented  above,  as  demonstrated  with  an 
academic  example  in  Li  et  al.  (2010).  We  must  decide 
whether  we  want  "automated"  results  that  may  be 
erroneous  or  "semi-automated"  design  solutions  that 
utilize  resources  efficiently  and  increase  the  likelihood 
of  performing  as  predicted  when  built.  Moreover,  the 
proposed  paradigm  enables  the  consideration  of 
additional  model  inputs  as  variables  during  the 
optimization  process.  It  also  enables  robustness  studies 
with  respect  to  model  inputs  that  are  design  parameters 
since  testing  resources  are  not  exhausted  a-priori  but 
managed  and  allocated  as  the  design  optimization 
process  progresses. 

In  section  2  we  review  our  previous  research  on 
design  optimization  using  local  domains.  In  section  3 
we  develop  the  statistical  models,  the  method  for 
choosing  the  number  of  tests  and  the  construction  of 
local  domains.  Section  4  presents  a  description  of  the 
piston  example,  on  which  we  apply  the  proposed 
methodology.  We  summarize  in  Section  5. 

2.  BRIEF  METHODOLOGY  REVIEW 

In  Li  et  al.  (2010)  we  first  proposed  the  idea  of 
validating  computational  models  via  calibration  as  the 
design  optimization  process  progresses  using  a 
sequential  approach  that  resembles  trust-region 


methods.  In  this  first  research  effort,  we  assumed  that 
data  can  be  obtained  for  each  of  the  local  domains  that 
are  defined  during  the  sequential  optimization  process. 
Moreover,  we  held  the  size  of  the  local  domains  (whose 
shapes  were  hyper-rectangles)  fixed,  and  used  interval- 
based  Bayesian  hypothesis  testing  to  compute  model 
validity  at  the  center  of  each  local  domain.  Our 
combined  design  optimization  and  validation  method 
was  based  on  calibrating  model  parameters  (if 
necessary)  to  ensure  that  model  validity  would  not  fall 
below  a  threshold.  Although  this  first  method  did  not 
account  for  test  or  model  uncertainty  directly,  and 
assumed  fixed  local  domains,  a  contrived  cantilever 
beam  design  example  we  used  to  demonstrate  our  idea 
highlighted  the  fact  that  a-priori  global  validation  can 
lead  to  invalid  and  worse  designs,  while  the  proposed 
sequential  design  optimization  approach  using 
calibration-based  validation  in  local  domains  converged 
to  a  valid  and  better  design. 

In  Drignei  et  al.  (2012a),  we  improved  our 
methodology  by  varying  the  size  of  the  local  domains 
to  reflect  model  validity.  In  this  work,  the  dimensions 
of  the  hyper-rectangles  around  a  center  point  design  are 
determined  such  that  model  predictions  match  test  data 
statistically  for  a  given  confidence  level.  Keeping  the 
assumption  that  test  data  can  be  obtained  for  each  local 
domain,  we  modeled  the  error  between  model 
predictions  and  test  data  as  a  Gaussian  process  and 
conducted  calibration-based  validation  using  a 
parametric  bootstrap  methodology  involving  maximum 
likelihood  estimators  of  unknown  model  parameters. 
We  formulated  a  design  optimization  problem  based  on 
the  "thermal  example"  of  Dowding  et  al.  (2008)  that 
was  developed  to  serve  as  a  validation  benchmark 
problem  for  a  validation  workshop  at  Sandia  National 
Laboratories.  We  demonstrated  that  i)  the  sequential 
process  generates  local  domains  of  different  dimensions 
and  ii)  that  the  active  constraints  (which  determine  the 
optimum)  move  around  the  design  space  as  the  models 
that  are  used  to  compute  them  are  re-calibrated  as 
necessary. 

While  the  Gaussian  process  assumption  is 
prevalent  in  the  literature  as  many  data  sets  exhibit 
Gaussian  behavior,  there  are  instances  where  this  is  not 
true.  Therefore,  in  Drignei  et  al.  (2012b),  we  used  a 
nonparametric  bootstrap  statistical  method  that  does  not 
rely  on  the  assumption  of  Gaussian  (or  any  other) 
distribution.  We  applied  the  nonparametric 
bootstrapping  technique  to  the  same  thermal  example. 
The  obtained  results  differed,  but  not  substantially. 
Nevertheless,  the  ability  to  not  have  to  rely  on  the 
Gaussian  (or  any  other)  distribution  may  be  crucial  for 
other  design  applications. 

In  this  paper,  we  lift  the  assumption  that  a  fixed 
number  of  tests  has  been  conducted  to  generate  data  for 
each  local  domain.  In  order  to  utilize  resources  as 
efficiently  as  possible,  we  assume  that  only  one  set  of 
test  data  exists  (or  can  be  obtained)  for  the  center  point 
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design  of  each  new  local  domain,  and  present  a 
technique  for  conducting  additional  tests  incrementally 
(one  by  one)  if  and  as  necessary  to  validate  the  model 
via  calibration,  so  that  it  satisfies  a  user-specified 
accuracy  threshold.  We  demonstrate  our  updated 
methodology  for  sequential  design  optimization  and 
validation  using  a  piston  design  example,  which 
involves  statistical  surrogates  for  the  responses  of 
interest. 

3.  TEST  VARIABILITY  AND  LOCAL  DOMAIN 
SIZING 

In  this  section,  we  first  review  the  statistical 
framework  to  model  the  error  between  test  data  and 
model  predictions.  For  convenience  of  notation,  since 
the  design  example  we  use  considers  only  scalar 
responses,  we  will  consider  the  case  where  responses 
are  spatially  invariant  and  time-independent.  A 
formulation  for  time -dependent,  spatially  variant 
responses  is  presented  in  Drignei  et  al.  (2012b).  We 
then  proceed  with  presenting  the  technique  for  linking 
test  variability  with  domain  size  to  determine  the 
number  of  required  tests  in  order  to  satisfy  local  domain 
model  validity  requirements. 

3.1  Statistical  error  modeling 

Letyf(d.p)  and  ym(d,p,c(d,p))  be  system 
responses  as  measured  in  a  test  and  predicted  by  a 
computational  model,  respectively,  where  d  e  di"'  are 
design  variables,  pe5l"'  are  design  parameters,  and 
c  e  are  model  calibration  parameters.  We  assume 
that  several  tests  y[(x),  k  =  l,...,r  are  available  at  a 

point  x  =  (d,p)e  ,  with  their  exact  number  r  to 

be  determined  later.  We  can  then  write 

y{  (x) = ym  (x’ c) + <KX) + £k  (x)  >  (!) 

where  d'fx)  is  the  prediction  bias  representing  the  mean 
simulation  model  error,  and  (x)  is  a  zero  mean 
random  quantity  representing  the  variability  between  a 
specific  test  response  (fixed  k)  and  a  specific  model 
prediction  (fixed  c).  Here,  for  simplicity  reasons,  we 
will  not  include  the  model  bias  explicitly.  However,  a 
bias  term  (random  or  not)  can  be  easily  included 
(Drignei  et  al.,  2012b;  Bayarri  et  al.,  2007;  Kennedy 
and  O’Hagan,  2001),  should  there  be  strong  evidence 
for  it  in  specific  applications.  The  above  statistical 
model  simplifies  to 

y‘k(x)=ym(x’c)+£k(x)-  (2) 

Since  there  can  be  several  responses  of  interest,  we 
denote  by  yf(x)  the  vector  of  test  responses,  by 
ym(x,c)  the  vector  of  model  responses  and  by  g(x)  the 


error  vector.  Due  to  measurement  error  and 
uncontrollable  testing  conditions,  the  test  responses 

yr(x)  may  be  variable.  In  order  to  capture  the  error 
variability  around  the  mean  statistically,  we  model  c(x) 
as  a  Gaussian  process  with  mean  vector  zero  and 
covariance  matrix  T .  The  latter  may  depend  on 
statistical  parameters  0  that  also  need  to  be  estimated. 
If  q>  =  (c,0)  includes  the  model  calibration  parameters  c 
and  the  statistical  parameters  0 ,  the  Gaussian 
probability  density  function  we  consider  is  (Schervish, 
1995) 

where  N  is  the  length  of  the  test  data  vector  yr(x).  For 
simplification,  the  covariance  matrix  is  chosen  as 
r  =  r2I  where  I  is  the  identity  matrix.  Therefore, 
0  =  t2  in  this  paper. 


3.2  Determining  the  local  domain 

When  the  point  x  is  fixed,  the  parameter  vector  cp 
is  estimated  by  maximizing  the  likelihood  function 
/.  |  /;^x  /  E/([1  (l(x  ^  f/>^  where  /e /®  (f(x)l^)  is 

provided  by  Equation  (3). 

Test  data  yf  (x)  at  point  x  are  used  to  determine  the 
size  of  the  local  domain  around  x,  by  extending  the 
likelihood  function  to  include  x  as  an  additional 
statistical  parameter;  i.e.,  L  </;|/;(xJj)-  We  can  then 

obtain  estimated  values  (x,<p)  of  the  parameters  (x,(p) 
by  maximizing  the  extended  likelihood  function  Zext. 
This  is  accomplished  by  generating  error  data  (x) 
from  the  Gaussian  distribution  with  zero  mean  and 
variance  r2,  and  re-estimating  (i.e.,  re-calibrating)  the 
model  by  maximizing  the  extended  likelihood  function 
Text-  This  process  is  repeated  (e.g.,5  =  50  times)  to 
obtain  the  parametric  bootstrap  finite  sample 
{xb},b  =  l,...,B  of  statistical  copies  ofx.  Then  the 
local  domain  will  be  an  ellipse  including  those  design 
points  x  that  satisfy 


r 

x.—x 


s . 

V  Z 


^  Zka 


where 


is  the  i 


•th 


component  of  x  ,  xt  is  the  sample  mean  of  xb  ,• ,  st  is 


'y 

the  sample  standard  deviation  of  xb  and  %K  ]-«  is  a 

critical  value  of  the  chi-square  distribution  with  nj+np 
degrees  of  freedom  (the  number  of  components  of  x ). 

'y 

For  example,  for  nd+np  =2  and  a  =  0.01  Xk,\-ci  =9.21 

based  on  the  table  of  chi-square  critical  values.  This  is 
an  established  statistical  method  to  obtain  exact 
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confidence  regions  in  multidimensional  cases 
(Schervish,  1995).  In  our  previous  work  we  used 
rectangular  shapes  for  the  local  domains  to  expedite  the 
computational  process.  The  resulting  confidence  level 
was  thus  only  approximate. 

3.3  Determining  the  number  of  required  tests  for 
each  local  domain 

In  our  previous  work,  we  assumed  that  the  model  is 
validated  through  calibration  in  a  local  domain  D 
uniformly,  i.e.,  that  the  model  is  valid  at  each  point 
inside  the  local  domain.  However,  model  validity  may 
vary  at  different  points  of  the  local  domain,  with  points 
farther  away  from  its  center  (where  test  data  are 
available)  possibly  having  lower  validity.  This  behavior 
stems  from  the  Gaussian  asymptotic  distribution  of  the 
maximum  likelihood  estimators  and  the  generalized 
least  squares  estimators  that  were  used  in  our 
parametric  and  non-parametric  bootstrap  approaches, 
respectively.  Moreover,  we  need  to  account  for  test 
variability  while  sizing  the  local  domains  (in  our 
previous  work  we  assumed  that  data  are  available  from 
a  fixed  number  of  tests). 

We  propose  therefore,  an  enhanced  calibration- 
based  validation  procedure  that  is  based  on  the  distance 
between  the  point  at  which  we  have  test  data  (the  center 
of  the  local  domain)  and  the  point  at  which  we  want  to 
utilize  the  model.  This  enhanced  technique  is  based  on 
the  assumption  that  test  and  model  data  are  statistically 
similar  at  the  center  of  the  local  domain,  while  this 
statistical  similarity  decreases  as  the  distance  from  the 
center  point  increases. 

We  define  a  pointwise  validation  measure  at  x  as 
the  correlation  between  model  data  at  the  center  xc  of 
the  domain  and  model  data  at  point  x.  This  could  be  an 
empirical  correlation,  or  a  more  sophisticated  model 
correlation.  In  this  paper,  we  use  the  surrogate  model 
correlation 

V(x)  =  C(x,xc),  (4) 

which  is  described  in  detail  in  the  next  section.  This 
correlation  inherently  depends  on  the  distance  between 
the  two  points  x  and  xc.  Furthermore,  we  define  the 
validation  level  of  the  local  domain  D  as 

VL  =  minU(x) .  (5) 

xe£> 

It  expresses  a  minimally  acceptable  model  correlation 
between  data  at  the  center  and  data  within  the  local 
domain.  This  new  validation  measure  is  a 
generalization  and  a  refinement  of  our  previous 
dichotomous  validation  measure,  where  the  model  at  a 
validating  design  point  was  either  validated  or  not, 
depending  on  whether  it  was  included  in  the  local 
domain  or  not. 

Using  statistical  arguments,  one  can  demonstrate 
that  if  the  number  of  tests  at  the  center  design  point 
increases  from  1  to  r,  the  local  domain  shrinks,  and  its 
size  is  expected  to  decrease  by  a  factor  of  order  r.  This 


follows  from  the  asymptotic  distribution  of  maximum 
likelihood.  Therefore,  by  increasing  the  number  of  tests, 
we  obtain  a  local  domain  that  is  more  strongly  validated 
according  to  the  proposed  validation  level  of  Equation 
(5)  because  all  its  points  are  closer  to  the  center  design 
point  where  the  test  data  are  available.  However, 
increasing  the  number  of  tests  also  increases  the  cost. 
Thus,  there  is  a  trade-off  between  the  cost  and  the 
degree  of  model  validation  of  the  local  domain.  We 
propose  the  following  algorithm  for  choosing  the 
number  of  tests  r  in  each  local  domain: 

1.  Start  with  a  minimum  number  of  tests  r0  and 
obtain  the  local  domain  using  high  confidence 
level,  for  example  99%.  In  this  local  domain, 
compute  the  validation  level  of  Equation  (5). 

2.  If  the  validation  level  is  above  a  high  pre- 

established  threshold  (e.g.  0.99),  stop  the 

algorithm  and  choose  r=r0. 

3.  If  the  validation  level  is  below  the  threshold, 
increase  the  number  of  tests  r  by  one  and  repeat 
steps  1  and  2  above. 

In  general,  the  confidence  level  used  to  construct  the 
local  domains  does  not  need  to  be  equal  to  the 
validation  level  threshold. 

3.4  Design  optimization 

Ultimately,  we  are  interested  in  optimizing  a 
design  objective  subject  to  constraints  where  the 
objective  and  constraints  are  evaluated  using  model 
predictions  for  the  responses  of  interest.  A  piston 
design  optimization  example  is  presented  in  the  next 
section.  The  following  steps  summarize  the  proposed 
sequential  design  optimization  approach  with 
concurrent  calibration-based  model  validation  using 
elliptic  local  domains  whose  size  is  determined  by 
accounting  for  test  variability  using  a  parametric 
bootstrap. 

1.  Choose  an  initial  point  x^°\  where  both  test 
data  and  model  predictions  are  generated. 

2.  Calibrate  the  computer  ( prediction )  model  at  the 
current  point  using  maximum  likelihood 
estimation,  parametric  bootstrap  and  an  initial 
number  r0  of  test  data. 

3.  Obtain  the  elliptic  local  domain  surrounding  this 
point. 

4.  Compute  the  validation  level  of  the  current  local 
domain.  If  it  is  below  a  specified  threshold, 
conduct  an  additional  test,  go  back  to  Step  3  and 
compute  a  new  elliptic  local  domain  using  the 
additional  test  data.  If  it  satisfies  the  validation 
level  threshold  go  to  Step  5. 

5.  Solve  the  design  optimization  problem  in  the 
local  domain  defined  at  Step  3. 
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6.  If  the  optimal  design  obtained  at  Step  5  is  on  the 
boundary  of  the  local  domain,  go  to  Step  2. 
Otherwise,  stop. 

4.  EXAMPLE:  PISTON  DESIGN 

Automotive  piston  design  is  a  challenging  problem 
involving  complex  physics  and  requiring  satisfaction  of 
multiple  performance  objectives.  Efforts  to  reduce 
engine  vibration  and  radiated  noise  while  improving 
durability,  fuel  economy  and  reliability  are  critical. 
Important  piston  design  requirements  include  high  scuff 
resistance,  low  friction  and  wear,  and  low  noise.  It  is 
common  practice  to  determine  the  optimal  piston  skirt 
profile  so  that  the  maximum  contact  pressure  at  the 
piston-bore  interface  is  kept  low  in  order  to  avoid 
scuffing.  Reduced  friction  and  low  piston  slap  noise  are 
used  as  additional  design  requirements. 

Despite  the  small  clearance  between  the  piston  and 
the  cylinder  bore,  the  piston  undergoes  a  secondary 
motion  within  the  bore,  which  can  generate  unwanted 
sound  and  vibration.  The  secondary  motion  is  caused 
by  the  piston  side  thrust,  which  changes  direction  due  to 
the  connecting  rod  angularity.  As  a  result,  the  piston 
moves  laterally  from  one  side  of  the  bore  to  the  other 
and  also  tilts  within  the  bore  clearance.  It  eventually 
collides  against  the  cylinder  wall.  This  impact  is 
commonly  known  as  piston  slap. 

To  improve  the  friction,  noise  and  scuffing 
performance,  the  piston  is  designed  with  barrel  and 
ovality  shapes.  Because  of  the  barrel  shape,  the  side 


load  on  the  piston  skirt  gets  distributed  more  evenly 
resulting  in  a  much  wider  contact  pattern  and  lower 
skirt  pressure  improving  the  piston’s  scuffing 
propensity.  The  piston  ovality  also  improves  the 
piston’s  scuffing  propensity  and  also  provides 
additional  clearance  to  parts  of  the  skirt  to  compensate 
for  the  piston’s  thermal  expansion.  The  piston  slap 
cannot  be  eliminated  completely  although  it  can  be 
significantly  reduced  with  a  good  piston  skirt  profile. 
The  combination  of  a  deformed  cylinder  with  a 
deformed,  profiled  piston  (oval  and  barrel  shapes) 
defines  the  clearance  at  running  conditions  affecting 
therefore,  the  performance  of  the  piston. 

An  analytical  model  was  developed  by  Patel  et  al. 
(2010)  and  Paras  et  al.  (2007)  to  simulate  the  piston 
secondary  dynamics  and  piston-bore  contact  for  an 
asymmetric  half  piston  model.  The  model  includes 
several  important  physical  attributes  such  as  bore 
distortion  due  to  mechanical  and  thermal  deformation, 
inertia  loading,  piston  barrelity  and  ovality,  piston 
flexibility  and  skirt-to-bore  clearance.  The  model 
accounts  for  piston  kinematics,  rigid-body  dynamics 
and  skirt  flexibility. 

Based  on  a  sensitivity  study  (Patel,  2009)  seven 
key  design  variables  were  identified  as  the  biggest 
contributors  on  piston  friction,  noise  and  scuffing. 
Table  1  provides  the  lower  and  upper  bound  for  each 
design  variable.  All  variables  are  assumed  normally 
distributed.  Table  1  also  provides  their  standard 
deviation. 


Table  1.  Piston  design  variables 


Design  variables 

Units 

Lower 

bound 

Upper  bound 

Standard 

deviation 

1 

Pin  offset 

mm 

-1 

1 

0.1 

2 

Cold  clearance 

Jim 

0 

0.025 

0.005 

3 

Clearance  at  section  A2 

(im 

0.1 

0.2 

0.05 

4 

Ovality  at  section  A2 

- 

0.4 

0.9 

0.1 

5 

Barrelity  at  section  A2 

- 

1.5 

4.0 

0.2 

6 

Cylinder  bore  distortion 

deg 

60 

120 

10 

7 

Coefficient  of  friction 

- 

0.05 

0.2 

0.05 

Figure  2  shows  some  piston  design  details.  From  the 
front  view,  the  piston  has  a  barrel  shape  which  allows 
different  level  of  clearance  at  different  sections.  The 
piston  skirt  is  divided  into  three  main  sections;  top  section 
Ao ,  middle  section  Al ,  and  bottom  section  A  .  The 

clearance  between  the  skirt  and  the  liner  in  the  middle 
section  represents  the  cold-clearance.  The  oval  shape  is 
defined  at  three  sections  along  the  piston  skirt  similarly  to 
the  piston  barrelity.  Because  of  its  criticality,  the  piston 


ovality  at  the  top-section  is  used  as  a  design  variable. 
Finally,  the  piston  pin  offset  and  the  cold  clearance  are 
also  design  variables.  The  pin  offset  is  the  distance 
between  the  piston  centerline  to  the  piston  pin-hole 
centerline. 
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Figure  2.  Piston  geometric  dimensions 


Here,  the  design  vector  x  consists  of  two  design 
variables  (pin  offset  and  cold  clearance),  whereas  the 
calibration  vector  c  includes  the  remaining  five  variables 
in  Table  1.  Denoting  by  /i(x,c)  the  skirt  pressure,  by 
gi(x,c)  the  total  friction  power  loss  and  by  g2(x,c)  the 
kinetic  energy  loss,  the  design  optimization  problem  is 
formulated  as 

min  fXx,  c) 

X 

subject  to  g,(x,c)  (6) 

g2(x,c)<7850. 

The  model  response  vector  in  this  example  is  thus  given 
by  y (x, c)  =  [f\  (x, c),  gl  (x,  c),  g2  (x,  c)f . 

4.1  Model  surrogates 

The  models  developed  by  Patel  et  al.  (2010)  and 
Paras  et  al.  (2007)  for  simulating  the  piston  motion  and 
computing  the  responses  fh  and  g2  are  computationally 
expensive  and  thus  prohibitive  for  optimization. 
Therefore,  following  established  methodology  on  design 
and  analysis  of  computer  experiments  (Santner  et  al., 
2003;  Fang  et  al.,  2006),  we  constructed  separate  scalar 
surrogates  for  each  response  using  data  at  100  points  of 
the  7-dimensional  input  space  chosen  according  to  a 
maximin  design  criterion  (Morris  et  al.,  1993).  We 
assumed  a  Gaussian  distribution  for  each  surrogate  y,  with 


constant  mean  //  and  Gaussian  correlation  (Currin  et  al., 
1991) 

7 

C(Xj,Xj)  =  cr2exp[-^6>A-(jfiv- -xkj)2], 
k= 1 

for  ij  =1,...,  100,  where  xk,i  are  input  coordinates  scaled 
to  the  interval  [0,1],  and  k=  1,...,7.  The  unknown 

parameters  \ are  estimated  by  maximum 
likelihood  estimation.  Denote  by  C  the  matrix  with 

elements  C(Xj,Xj)  anc|  by  y  the  vector  of  model  data. 

The  surrogate  at  a  new  input  of  coordinates  xk, o  is 
chosen  to  be  the  conditional  mean 

y  =  ju+ C0rC_1  (Y  -  /A),  (7) 

2  T  —1 

with  the  conditional  variance  a~(\  —  Cq  C  Cq)  for  the 
surrogate  error.  In  Equation  (7),  the  vector  C()  has 
components 

7 

C0(i)  =  a 2  exp [ 6k (xk  i  -  xk  Q)2\  for  i  =  1,...J00  and 
k= 1 

1  represents  a  unit  vector. 

In  the  rest  of  the  paper  we  replace  the  model  data 
ym(x,c)  with  surrogates  ym(x,c)  throughout  the  input 
space. 

The  surrogate  surfaces  may  be  characterized  by 
artificial,  local  optima.  As  the  results  below  will  show, 
this  may  be  a  potential  drawback  in  using  surrogates  in 
optimization  processes.  One  can  decrease  the  potential  of 
such  artificial  local  optima  by  increasing  the  sample  of 
design  points,  at  the  expense  of  a  more  computationally 
intensive  and/or  numerically  unstable  procedure. 
Alternatively,  starting  the  optimization  process  from 
different  initial  conditions  may  lead  to  the  global 
optimum. 

Table  2  presents  the  maximum  likelihood  estimated 
statistical  parameters  for  the  three  surrogates.  Figure  1 
shows  the  cross-validation  plots,  indicating  that  the 
surrogates  are  acceptably  accurate.  The  “leave-one-out” 
cross-validation  method  has  been  used.  The  plots  in 
Figure  1  show  the  actual  model  response  data  versus  the 
cross-validation  surrogate  model  response  data  at  all  100 
points.  In  such  plots,  the  closer  the  scatter-plot  is  to  the 
main  diagonal,  the  better  the  statistical  surrogate  is. 


Table  2.  Surrogate  maximum  likelihood  estimates 


Surrogate 

response 

Oi 

0, 

o. 

0, 

0, 

^7 

cr 

Skirt  pressure 

37.1286 

18.2107 

0.1692 

0.6668 

39.6980 

0.2646 

0.2141 

0.0368 

12.6683 

Friction  power  loss 

2.0366 

0.0043 

0.6496 

0.1698 

0.4760 

1.7270 

3.0177 

4.6468 

1.3090 

Kinetic  energy  loss 

8519 

0.1070 

0.0000 

0.2302 

0.0000 

15.9649 

57.2604 

0.2637 

1362 

6 


Figurel.  Surrogate  cross-validation  results 


Pin  offset 


Figure  2a.  Optimization  results  with  variable  number  of  tests  for  initial  guess:  pin  offset  =  -0.4,  cold  clearance  =  0.01 
objective  iso-contours  in  yellow,  friction  power  loss  constraint  boundaries  in  green  and  kinetic  energy  loss  constraint 

boundaries  in  blue 


Pin  offset 

Figure2b.  A  zoom-in  of  Figure  2a,  illustrating  the  upper  part  of  the  optimization  process;  objective  iso-contours  in 
yellow,  friction  power  loss  constraint  boundaries  in  green  and  kinetic  energy  loss  constraint  boundaries  in  blue 
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Table  3.  Summary  of  optimization  results  with  variable  number  of  tests  for  initial  guess:  pin  offset  =  -0.4,  cold 

clearance  =  0.01 


Step 

# 

Ellipse  semiaxis 
length 

„  u 

r 

Validation 

level 

Clearance 
at  A2 
section 

Ovality  at 
A2  section 

Barrelity 
at  A2 
section 

Cylinder 

Bore 

Distortion 

Coeff.  of 

Friction 

Pin 

Offset 

optimum 

Cold 

clearance 

optimum 

0 

0.0030 

0.0021 

2 

0.9960 

0.1585 

0.6301 

2.7647 

87.6363 

0.1057 

-0.3977 

0.0112 

1 

0.0036 

0.0024 

2 

0.9944 

0.1524 

0.6290 

2.7705 

88.5236 

0.1014 

-0.3949 

0.0126 

2 

0.0039 

0.0024 

2 

0.9936 

0.1491 

0.6347 

2.7769 

88.8969 

0.0946 

-0.3921 

0.0143 

3 

0.0027 

0.0030 

2 

0.9905 

0.1503 

0.6346 

2.7722 

88.6851 

0.0968 

-0.3907 

0.0169 

4 

0.0027 

0.0021 

3 

0.9948 

0.1531 

0.6209 

2.7538 

88.0373 

0.1150 

-0.3888 

0.0180 

5 

0.0036 

0.0021 

5 

0.9954 

0.0498 

0.5966 

2.9990 

91.8587 

0.0602 

-0.3902 

0.0161 

6 

0.0039 

0.0021 

5 

0.9954 

0.1330 

0.6359 

2.7923 

89.9825 

0.0740 

-0.3869 

0.0170 

7 

0.0027 

0.0021 

5 

0.9955 

0.1409 

0.6420 

2.8061 

89.8938 

0.0793 

-0.3849 

0.0185 

8 

0.0052 

0.0027 

12 

0.9944 

0.1441 

0.6385 

2.7775 

89.1672 

0.0863 

-0.3809 

0.0201 

9 

0.0061 

0.0030 

5 

0.9912 

0.1209 

0.6286 

2.7690 

90.1270 

0.0723 

-0.3760 

0.0209 

10 

0.0134 

0.0027 

14 

0.9933 

0.1482 

0.6326 

2.7592 

88.6176 

0.1089 

-0.3637 

0.0215 

11 

0.0112 

0.0030 

20 

0.9928 

0.1496 

0.6302 

2.7541 

88.2910 

0.1097 

-0.3533 

0.0218 

12 

0.0018 

0.0018 

8 

0.9966 

0.1282 

0.6202 

2.7471 

89.8001 

0.1343 

-0.3519 

0.0218 

96 

0.0018 

0.0015 

2 

0.9978 

0.1508 

0.6428 

2.7446 

86.5974 

0.1124 

-0.1453 

0.0082 

97 

0.0036 

0.0012 

2 

0.9982 

0.1456 

0.6442 

2.7623 

85.3336 

0.1023 

-0.1450 

0.0081 

Table  4.  Summary  of  optimization  results  with  fixed  number  of  tests  for  initial  guess:  pin  offset  =  -0.4, 

cold  clearance  =  0.01 


Step 

# 

Ellipse  semiaxis 
length 

n  h 

r 

Validation 

level 

Clearance 
at  A2 
section 

Ovality  at 

A2  section 

Barrelity 
at  A2 
section 

Cylinder 

Bore 

Distortion 

Coeff.  of 

Friction 

Pin  Offset 
optimum 

Cold 

clearance 

optimum 

0 

0.0030 

0.0021 

2 

0.9960 

0.1585 

0.6301 

2.7647 

87.6363 

0.1057 

-0.3977 

0.0112 

1 

0.0036 

0.0024 

2 

0.9944 

0.1524 

0.6290 

2.7705 

88.5236 

0.1014 

-0.3949 

0.0126 

2 

0.0039 

0.0024 

2 

0.9936 

0.1491 

0.6347 

2.7769 

88.8969 

0.0946 

-0.3921 

0.0143 

3 

0.0027 

0.0030 

2 

0.9905 

0.1503 

0.6346 

2.7722 

88.6851 

0.0968 

-0.3907 

0.0169 

4 

0.0046 

0.0052 

2 

0.9767 

0.1490 

0.6358 

2.7780 

88.9326 

0.0940 

-0.3882 

0.0215 

5 

0.0452 

0.0061 

2 

0.9621 

0.1665 

0.6532 

2.9547 

90.2804 

0.1133 

-0.3448 

0.0217 

6 

0.1505 

0.0058 

2 

0.9122 

0.1783 

0.6597 

2.9875 

89.8286 

0.1098 

-0.2151 

0.0224 

7 

0.0455 

0.0067 

2 

0.9586 

0.1473 

0.6335 

2.8164 

89.8114 

0.0943 

-0.1677 

0.0208 

8 

0.0294 

0.0067 

2 

0.9656 

0.1555 

0.6332 

2.7942 

88.1026 

0.0928 

-0.1456 

0.0175 

9 

0.0167 

0.0070 

2 

0.9530 

0.1510 

0.6327 

2.7808 

87.5601 

0.0924 

-0.1352 

0.0138 

10 

0.0030 

0.0027 

2 

0.9925 

0.1519 

0.6335 

2.7838 

87.1252 

0.0929 

-0.1342 

0.0114 

11 

0.0039 

0.0024 

2 

0.9948 

0.1512 

0.6336 

2.7844 

87.7648 

0.0880 

-0.1326 

0.0098 

12 

0.0024 

0.0018 

2 

0.9965 

0.1490 

0.6322 

2.7757 

87.5562 

0.0931 

-0.1318 

0.0084 

48 

0.0003 

0.0012 

2 

0.9986 

0.1647 

0.6465 

2.8418 

87.1074 

0.0786 

0.0000 

0.0019 

49 

0.0003 

0.0009 

2 

0.9989 

0.1535 

0.6353 

2.8059 

89.0702 

0.0780 

0.0000 

0.0019 

8 


Figure  3.  Optimization  results  with  fixed  number  of  tests  for  initial  guess:  pin  offset  =  -0.4,  cold  clearance  =  0.01; 
objective  iso-contours  in  yellow,  friction  power  loss  constraint  boundaries  in  green  and  kinetic  energy  loss  constraint 

boundaries  in  blue 


Pin  offset 


Figure  4.  Optimization  results  with  variable  number  of  tests  for  initial  guess:  pin  offset  =  -0.64,  cold  clearance  =  0.02 
objective  iso-contours  in  yellow,  friction  power  loss  constraint  boundaries  in  green  and  kinetic  energy  loss  constraint 

boundaries  in  blue 
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Figure  5.  Optimization  results  with  fixed  number  of  tests  for  initial  guess:  pin  offset  =  -0.64,  cold  clearance  =  0.02; 
objective  iso-contours  in  yellow,  friction  power  loss  constraint  boundaries  in  green  and  kinetic  energy  loss  constraint 

boundaries  in  blue 

Table  5.  Summary  of  results  as  a  function  of  number  of  tests  necessary  to  obtain  the  first  local  domain  when 
algorithm  starts  at  pin  offset  =  -0.35  and  cold  clearance  =  0.022. 


r 

Ellipse  semiaxis 

Validation 

Clearance 

Ovality  at 

Barrelity 

Cylinder 

Coeff.  of 

length 

„  h 

level 

at  A2 
section 

A2  section 

at  A2 
section 

Bore 

Distortion 

Friction 

2 

0.0256 

0.0024 

0.9473 

0.1362 

0.6258 

2.7579 

89.0112 

0.0798 

3 

0.0153 

0.0024 

0.9475 

0.1438 

0.6284 

2.7790 

89.2401 

0.0860 

4 

0.0116 

0.0022 

0.9553 

0.1408 

0.6278 

2.7728 

89.1668 

0.0802 

5 

0.0053 

0.0021 

0.9613 

0.1398 

0.6275 

2.7669 

88.9502 

0.0763 

6 

0.0078 

0.0021 

0.9591 

0.1354 

0.6274 

2.7658 

89.1418 

0.0711 

7 

0.0073 

0.0023 

0.9534 

0.1474 

0.6319 

2.7846 

88.8173 

0.0713 

8 

0.0068 

0.0017 

0.9744 

0.1591 

0.6361 

2.7999 

88.1488 

0.0706 

9 

0.0074 

0.0018 

0.9710 

0.1589 

0.6368 

2.8016 

87.8803 

0.0633 

10 

0.0118 

0.0018 

0.9725 

0.1417 

0.6304 

2.7789 

88.9059 

0.0643 

11 

0.0069 

0.0011 

0.9882 

0.1541 

0.6364 

2.8116 

88.9887 

0.0631 

12 

0.0042 

0.0015 

0.9794 

0.1558 

0.6370 

2.8125 

88.8671 

0.0630 

13 

0.0035 

0.0014 

0.9836 

0.1421 

0.6309 

2.7857 

89.0676 

0.0618 

14 

0.0061 

0.0019 

0.9714 

0.1435 

0.6300 

2.7742 

88.1697 

0.0589 

15 

0.0050 

0.0016 

0.9764 

0.0933 

0.6206 

2.7298 

90.5046 

0.0585 

16 

0.0062 

0.0017 

0.9734 

0.1264 

0.6270 

2.7848 

90.0593 

0.0628 

17 

0.0021 

0.0013 

0.9862 

0.1442 

0.6331 

2.8202 

89.8644 

0.0647 

18 

0.0025 

0.0011 

0.9907 

0.1333 

0.6273 

2.7825 

89.6645 

0.0650 

4.2  Design  optimization  results 

Figure  2  and  Table  3  summarize  the  results  of  the 
optimization  process,  implementing  the  technique 

presented  in  Section  3.  Test  data  yf(x)  were  simulated 

from  the  Gaussian  distribution  with  mean  y'"(x,c)  and 

standard  deviation  r  =  0. 1 .  The  initial  guess  used  for 
optimization  was  pin  offset  =  -0.4  and  cold  clearance  = 
0.01.  The  optimization  within  each  local  domain  started  at 


the  center  of  the  ellipse.  Objective  function  iso-contours 
are  shown  in  yellow  only  for  the  last  step  of  the 
optimization  process.  Constraint  boundaries  are  shown  in 
green  and  blue  for  the  friction  power  and  kinetic  energy 
losses,  respectively.  The  constraint  boundaries  "move" 
about  the  design  space  as  the  sequential  optimization 
process  progresses  due  to  the  re -calibration  of  the  models 
as  necessary. 
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Due  to  the  many  steps  of  the  sequential  optimization 
process.  Table  3  includes  only  the  first  13  and  the  last  2 
steps.  The  initial  number  of  tests  for  each  local  domain  is 
r0= 2.  One  can  see  that  after  the  first  4  steps  the  number  of 
tests  per  local  domain  increases  to  achieve  the  required 
high  validation  level.  Towards  the  end  of  the  sequential 
optimization  process  the  validation  level  is  high  enough 
that  no  additional  tests  are  required,  so  the  last  local 
domains  are  based  on  r0=  2  tests.  Figure  3  and  Table  4 
show  results  for  fixed  number  of  tests  (r0= 2)  at  each  local 
domain  of  the  sequential  optimization  process.  One  can 
see  that  the  local  domains  in  the  middle  stage  of  the 
sequential  optimization  process  are  larger.  Therefore,  the 
points  located  near  the  boundaries  of  the  local  domains 
are  assumed  to  be  not  as  valid  as  the  points  closer  to  the 
center  of  the  ellipse,  where  test  data  are  available.  These 
ellipses  are,  in  fact,  much  larger  than  those  towards  the 
middle  stage  of  the  optimization  process  in  Figure  2a. 
This  was  to  be  expected,  since  additional  tests  will 
decrease  the  area  of  the  local  domains.  The  validation 
levels  reported  in  Tables  3  and  4  were  computed  using 
1000  sample  points  within  each  local  domain. 

Not  using  the  new  method  that  controls  the  size  of 
the  local  domain  may  result  in  obtaining  different  optimal 
designs.  Indeed,  the  optimal  design  in  Table  3  is  pin 
offset=-0.1450  and  cold  clearance  =  0.0081,  whereas  the 
optimal  design  in  Table  4  is  pin  offset=0.0000  and  cold 
clearance  =  0.0019.  A  pin  offset  value  very  close  to  0  is 
not  meaningful  from  an  engineering  point  of  view. 

We  have  replicated  the  optimization  process  several 
times  using  the  same  initial  guess.  The  results  differ 
slightly  due  to  the  different  sets  of  random  numbers. 
While  there  are  differences  among  statistically  replicated 
optimization  paths,  the  ones  using  the  new  method  tend  to 
converge  to  an  optimal  design  farther  away  from  the 
origin  than  those  not  using  it.  The  possible  cause  of  this 
behavior  is  the  qualitative  change  in  the  optimization 
surface  around  the  objective  function  iso-contour  of  40 
MPa.  Specifically,  this  line  appears  to  separate  the  area 
including  the  local  maximum  associated  with  the  contour 
line  65  MPa  from  the  local  minimum  associated  with  the 
contour  line  25  MPa.  In  general,  one  would  like  to  have  a 
more  conservative  optimization  method  that  uses  small 
steps  in  such  areas  of  qualitative  changes.  The 
optimization  based  on  the  presented  method  seems  to 
naturally  reduce  the  size  of  local  domains  in  this  area, 
whereas  the  method  used  in  our  previous  work  did  not. 

Figures  4  and  5  depict  what  happens  when 
considering  a  different  initial  guess  for  the  sequential 
design  optimization  process,  namely  pin  offset  =  -0.64, 
cold  clearance  =  0.02  where  the  optimization  within  each 
local  domain  started  at  the  extreme  left  end  of  each 
ellipse.  Using  the  enhanced  technique  that  accounts  for 
test  variability  when  sizing  the  local  domain  (new 
validation  measure),  the  final  optimal  values  are  similar  to 
those  obtained  using  the  previous  initial  guess  (Figure  4). 
This  is  not  the  case  when  the  number  of  tests  is  held  fixed 


for  each  local  domain.  In  this  case,  the  optimization 
process  converges  to  an  inferior  local  optimum  (Figure  5). 
Even  though  this  may  have  been  caused  by  the  surrogates, 
the  enhanced  technique  did  not  get  trapped  in  this  local 
optimum.  We  conclude  that  varying  the  local  domain  size 
according  to  test  variability  may  have  advantages 
regarding  the  quality  of  the  obtained  optimal  designs, 
even  though  this  may  come  at  the  expense  of  additional 
tests. 

To  provide  insight  on  how  the  validation  level 
changes  with  the  number  of  tests,  we  started  the 
optimization  algorithm  at  a  new  initial  design  with  pin 
offset  =  -0.35  and  cold  clearance  =  0.022  and  stopped  the 
algorithm  after  the  first  local  domain  was  identified.  A 
number  of  r  =  18  tests  were  necessary  to  achieve  the 
prescribed  validation  level  of  0.99.  Figure  6  shows  the 
validation  level  as  the  number  of  tests  varies  from  2  to  18. 
As  expected,  adding  more  tests  increases  the  validation 
level  although  the  increase  is  not  monotonic.  Table  5 
provides  more  details  for  the  18  tests  showing  that  in 
general  the  size  of  the  ellipses  decreases  with  the  number 
of  tests.  It  should  be  noted  that  each  row  in  Tables  3  and  4 
show  only  the  last  row  of  Table  5  for  each  identified  local 
domain. 


Figure  6.  Validation  level  versus  number  of  tests  to 
obtain  the  first  local  domain  for  an  initial  pin  offset  = 
-0.35  and  cold  clearance  =  0.022. 

5.  SUMMARY 

In  our  recently  developed  methodology  for  sequential 
design  optimization  with  concurrent,  calibration-based 
model  validation,  the  number  of  tests  to  obtain  data  was 
assumed  constant  for  each  local  domain.  Recognizing  that 
test  variability  can  play  a  significant  role  in  the  statistical 
error  models  we  employ,  we  developed  a  new  method  that 
accounts  for  it  when  determining  the  size  of  local 
domains  within  which  we  can  assume  that  the  models 
used  to  obtain  objective  and  constraint  function  values  are 
sufficiently  valid. 

Using  an  automotive  piston  design  example,  we  have 
demonstrated  again  that  local  model  calibration  is 
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necessary  to  ensure  that  the  sequence  of  design 
optimization  candidates  is  validated.  Moreover,  we  have 
shown  that  test  variability  can  be  accounted  for  as 
necessary  by  varying  the  size  of  the  local  domains  to 
ensure  that  validity  thresholds  are  satisfied.  Finally,  for 
the  particular  example  considered  in  this  paper,  we 
observe  that  optimal  designs  obtained  using  the  proposed 
method  are  superior  to  these  obtained  without  using  it. 
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